Abstract: A major challenge in video segmentation is that the foreground object may move rapidly in the scene while its appearance and shape change over time. While pairwise potentials used in graph-based algorithms encourage smooth labels between neighboring (super)pixels in space and time, they offer only a myopic view of consistency and can be misled by inter-frame optical flow errors. We propose a higher-order supervoxel label consistency potential for semi-supervised foreground segmentation. Given an initial frame with manual annotation for the foreground object, our approach propagates the foreground region through time, leveraging bottom-up supervoxels to guide its estimates towards long-range coherent regions. We validate our approach on three challenging datasets and achieve state-of-the-art results.
Keywords: video segmentation, foreground region, semi-supervised foreground segmentation, coherent regions.